ABSTRACT


The principal component needed to model a satellite communications network node is a high-speed
packet routing switch. The design and testing of routing switches utilizing Transputer networks has
been examined extensively in the literature. One such implementation by Khan and Ward made use
of cooperating processes scheduled by the Transtech Genesys operating system. The performance of
this system was unacceptable, with problems attributed to packet duplication and process contention. In
this report, we examine an alternate approach using cooperating occam processes without the overhead
associated with an operating system.

The routing switch permits dynamic updates to its routing tables from a host computer. Library
functions are provided to host processes to offer this service. However, accommodating these updates
influences the design of the underlying Transputer network. Allowances must be made not only for
expedient routing of messages within the Transputer network but also for rapid and efficient routing table
updates within the Transputer network. Results discussed in this report reflect the design and performance
of the routing switch. Throughput efficiency under a variety of test conditions is also provided.

1 INTRODUCTION


Space-based communication network resources are too expensive for use in experimental studies, and
ground-based testbeds are generally used to provide a cost-effective means of studying them. One
such testbed was constructed to study laser cross-link targeting and packet routing in low altitude
multiple satellite (LAMS) networks (Chow, Newman-Wolfe, Ward and McLochlin 1988). The testbed
configuration consists of five fully connected nodes with each node modelling a satellite in the LAMS
network. Each satellite (node) will possess four bidirectional laser transceivers with which to establish
cross-link communications, as illustrated in Figure 1. In the testbed, each node is modelled using a PC
with additional communications hardware. The nodes communicate with one another via bench-mounted
retargetable laser transceivers.


Figure 1 Satellite communication channels.


Communication links between the nodes are required to support full-duplex data rates of at least
1 Mbps to adequately model a LAMS cross-link. Since a Host PC alone could not satisfy the high
data rate requirements, a high speed peripheral device with a PC interface was required. The INMOS
T800 Transputer system was evaluated in the original design for its ability to achieve the required
rates. Transtech’s Genesys operating system (Transtech Devices Limited 1990) and high-level language
compilers, coupled with the Transputer’s high external data rates, were selected as the ideal combination
for providing the Host PC’s high-speed data communications needs. Details of the implementation are
provided in (Khan and Ward 1991; Ward and Khan 1991).

Low throughput rates using the above system rendered it unsuitable for the project testbed (Ward,
Khan, Chow and Newman-Wolfe 1992). Link performance data (Crowell 1990) suggests that the Inmos
physical and link layer protocols were not responsible for the poor performance. An alternative
explanation for the poor performance is excessive packet duplication required to communicate across
intermediate processes and process contention while serving multiple communication channels. With this
in mind, this paper evaluates an alternate approach that includes communicating sequential processes
(CSP) (Hoare 1985) specifications and an occam implementation that avoids the overhead associated
with an operating system.

Section  2  reviews  preliminary  design  considerations  relevant  to  new  satellite  model  development. 
Section  3  describes  the  Transputer  configuration  within  the  context  of  the  satellite  model  implementation. 
Section  4  highlights  the  functional  design  of  each  of  the  occam  processes  along  with  enhancements  made 
during  the  development  process.  Throughput  statistics  for  the  preliminary  and  enhanced  versions  of  the 
new  routing  switch  under  a  variety  of  test  conditions  are  presented  in  Section  5.  Comparisons  between  the 
original Genesys, preliminary occam, and enhanced occam implementations are discussed in Section 6.

2  PRELIMINARY  DESIGN  CONSIDERATIONS 

Satellite  Network  Fundamentals 

Satellite networks typically have the following features: one or more satellites placed in geosynchronous
orbits; satellites used as communications relays; point-to-point uplinks, with a common channel
shared by several earth stations using a combination of FDMA and TDMA; broadcast downlinks; and long
propagation delays (on the order of 270 ms). Low Altitude Multiple Satellite (LAMS) network systems
employ point-to-point laser communications, which satisfy precise targeting, low power, light weight
and small size (so-called SWAP: size, weight and power) constraints (Paul and Marshalek 1989). Such
systems have wide potential in defense and space applications and are characterized as follows:

1. Multiple satellites in a low altitude orbit functioning as store-and-forward Data Communications
Equipment (DCE).

2.  Point-to-point  laser  communication  with  very  high  bandwidth  (20  Mbps  -  1  Gbps). 

3. Long distance communication links (1,000 km - 10,000 km) with relatively high bit error rate
(BER,  typically  lO'^  -  10"^). 

4.  Limited  communication  links  per  satellite  due  to  size,  weight  and  power  (SWAP). 

5.  High  mobility. 

The laser link channel is characterized by random errors resulting from optical noise sources such as
quantum noise, preamplifier thermal noise, dark current noise, detector excess noise, and optical background
noise; and by burst errors from beam mispointing and subsequent tracking loss (Paul and Marshalek
1989). The number of satellites in this environment is expected to range from twenty to over a hundred.
Each satellite will be in direct (i.e. point-to-point) communication with a few others. In the model
currently under investigation this number is constrained to four (i.e. each satellite will have four laser
transceivers). A typical network is illustrated in Figure 2. Additional details of LAMS networks may be
found in (Ward and Khan 1991; Ward and Choi 1991).

Figure 2 Low altitude multiple satellite (LAMS) network.


Testbed Requirements

One of the many components of a low altitude multiple satellite (LAMS) network is a retargetable
communications laser. Initial studies into laser communications, target acquisition, and routing that
model a five node LAMS network configuration can be performed using a ground-based testbed. Of
these five nodes, only three will actually use laser trackers to communicate with each other; the others
will be directly connected through link simulators. These simulators permit a variety of link conditions
to be investigated, including burst errors, random errors, and link failure. The testbed model is illustrated
in Figure 3.


Figure  3  Testbed  for  five  node  simulation. 


The directly connected links of the satellite nodes require full-duplex communications with data rates
of at least 1 Mbps. Since a host computer cannot satisfy the high data rate requirements, a high speed
peripheral device with a PC interface, such as a Transputer, is needed.

Transputer Architecture Fundamentals

A Transputer is a complete computer on a chip. The Transputer used in this project (an INMOS T800)
is composed of a 32 bit processor capable of 10 million instructions per second, a hardware floating point
unit, 4 Kbytes of very fast static RAM, a programmable memory interface (which allows up to 4 Gbytes
of physical memory external to the Transputer) and four bidirectional communication links capable of rates
up to 20 Mbps. Each communication link is implemented as an autonomous DMA (Direct Memory
Access) engine so that it can perform communications with external devices as background tasks to the
processor with negligible performance degradation. This architecture is reflected in Figure 4.


Figure 4 Block diagram of the 32-bit Transputer.


The topology of T800 configurations is constrained by the fixed communication links, allowing
flexibility only in selecting available soft link connections. The on-site hardware is an MCP1000 system
with four sites (or rows) by four slots (or columns) filled, for a total of sixteen Transputers. Fixed
link connections are installed between consecutive Transputers in the same site to connect them together
in series. The two remaining links on each transputer are connected to a software-controlled crossbar
switch. These “link switches” enable the user to connect any link to any other link that is connected
to any switch on the board.

The occam programming language was designed to simplify the task of concurrent programming on
networks of INMOS transputers. An occam program is made up of a number of processes which can be
declared to run sequentially or concurrently. Concurrent processes, which cannot use shared resources,
communicate across occam channels. These channels are single direction, point to point connections
between processes that provide synchronized message communication.

Transputers are equipped with hardware to support concurrency and message passing via occam channels.
A collection of concurrently executing occam processes can be directly mapped onto either one transputer,
which shares its time between them, or onto multiple transputers, each taking a subset of the processes.

Synchronized communications between INMOS Transputers use the TRAM link protocol shown in
Figure 5. The data format is as follows: each byte is transmitted as a start bit, followed by a one bit,
eight data bits and a stop bit.


Figure 5 Link protocol (data packet and acknowledge formats).


After transmitting a data byte, the sender waits until an acknowledge is received, consisting of a
start bit followed by a zero bit. The acknowledge signifies both that a process was able to receive the
acknowledged byte, and that the receiving link is able to receive another byte. Acknowledges may not
be sent in advance. The receiving end starts with an empty buffer, ready to receive the first byte. The
sending link reschedules the sending process only after the acknowledge for the final byte of the message
has been received.
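
As a rough illustration of the framing described above, the following Python sketch builds the bit sequences for a data byte and an acknowledge, and estimates the payload rate when every byte must be acknowledged before the next one is sent. The bit polarities (start bit = 1, stop bit = 0), the least-significant-bit-first ordering, and the assumption that data and acknowledge times simply add with no propagation or turnaround delay are simplifications chosen for this sketch; they are not taken from this report. All names are hypothetical.

```python
# Sketch of the link framing: 11 bits per data byte, 2 bits per acknowledge.
# Polarities and bit order are assumptions made for illustration only.

DATA_FRAME_BITS = 11   # start bit + one bit + eight data bits + stop bit
ACK_FRAME_BITS = 2     # start bit + zero bit
ACK_FRAME = [1, 0]


def frame_data_byte(value):
    """Return the 11-bit frame for one data byte as a list of 0/1 values."""
    if not 0 <= value <= 255:
        raise ValueError("value must fit in one byte")
    data_bits = [(value >> i) & 1 for i in range(8)]  # LSB first (assumed)
    return [1, 1] + data_bits + [0]


def payload_rate_mbps(link_rate_mbps):
    """Upper bound on payload rate if each byte waits for its acknowledge.

    Eight payload bits are delivered for every 11 + 2 bit times on the wire.
    """
    return link_rate_mbps * 8 / (DATA_FRAME_BITS + ACK_FRAME_BITS)


if __name__ == "__main__":
    print(frame_data_byte(0x01))
    print(round(payload_rate_mbps(20.0), 2))
```

Under these assumptions a 20 Mbps link carries at most 8/13 of its raw rate as payload, which is still far above the 1 Mbps that the testbed requires.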


A network of INMOS T800 Transputers can satisfy the host computer’s high speed data communications
needs because it has an inherent capability for high speed, full duplex, synchronous communications,
and also because high level language compilers (such as occam) are available for Transputer programming.


Original  Simulation  Results 


The Genesys and C implementation found in (Khan and Ward 1991; Ward and Khan 1991) provides
the background for the new work. The general hardware and software architectures were maintained in
the new design to permit meaningful comparisons with the results of the original implementation. These
architectures are presented in Section 3. Genesys and C implementation test results are provided for
comparison with the results presented in Section 5. The following sections describe each of the tests
performed, their respective configurations and the recorded results. An attempt is also made to justify
the selection of these tests.


Test 0: Maximum Link Throughput. The Transputer links are reported to function at 10 or 20
Mbps, as selected by the user. Their throughput was recalculated using the Genesys supplied datalink
layer functions so that the throughput results obtained from the routing switch tests could be consistently
compared to the maximum data rate. The maximum data rate for the Transputer links was computed by
sending messages from one TRAM to an adjacent TRAM over a specified link. The source transmitted
5000 1002-byte packets at constant packet delay intervals. The maximum throughput was calculated
by decreasing the packet transmitting delay on successive runs of the test. The received messages
were timed and the throughput was calculated. The maximum throughput recorded from the test was
λ_max = 0.707 Mbps.

Although the maximum data rate of the Transputer links is easy to calculate, the accumulation of
meaningful performance results for the Transputer based router is more complicated. This difficulty stems
from the fact that there are numerous modes of operation in which these results could be accumulated.
Because some of these modes of operation yield redundant information, tests were performed on
strategically identified configurations. These cases were chosen because they would yield valuable
information in determining the routing switch’s performance in its original testbed environment.


Test 1: Maximum Switch Throughput. The first test involves recording the throughput of the
packet routing switch under ideal conditions. The purpose of this test is to compute the maximum
possible data rate at which the switch can operate. These ideal conditions are defined as when the
message traffic is one-way along a link and takes the shortest path through the switch from a source
(src) to a sink (snk).

The generator transmitted 5000 1002-byte packets at constant packet delay intervals. The
maximum throughput was calculated by decreasing the packet transmitting delay on successive runs
of the test. The maximum throughput recorded from the test was λ = 0.531 Mbps and the throughput
efficiency (relative to the maximum link throughput) was λ_eff = 75.06%.
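
The throughput and efficiency figures quoted for these tests can be related by a short calculation. The Python sketch below is illustrative only: it assumes that 1 Mbps means 10^6 bits per second and that throughput is the total number of bits received divided by the elapsed time, and the elapsed time used in the example is a made-up value rather than a measurement from the report. Function names are hypothetical.

```python
# Sketch: throughput from a timed run, and efficiency relative to the link maximum.
# Assumes 1 Mbps = 1e6 bits per second; the elapsed time below is hypothetical.

def throughput_mbps(packets, packet_bytes, elapsed_seconds):
    """Throughput in Mbps for `packets` packets of `packet_bytes` bytes each."""
    total_bits = packets * packet_bytes * 8
    return total_bits / elapsed_seconds / 1e6


def efficiency_percent(throughput, link_max):
    """Throughput efficiency as a percentage of the maximum link throughput."""
    return 100.0 * throughput / link_max


if __name__ == "__main__":
    # 5000 packets of 1002 bytes received in a hypothetical 80.16 seconds.
    print(throughput_mbps(5000, 1002, 80.16))
    # Efficiency recomputed from the rounded figures 0.531 and 0.707 Mbps.
    print(round(efficiency_percent(0.531, 0.707), 2))
```

Recomputing the Test 1 efficiency from the rounded throughputs gives roughly 75.1%, close to the reported 75.06%; the small difference presumably arises because the reported value was derived from unrounded measurements.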

Test 2: Congested External Link. The worst possible performance of the packet routing switch is
expected to take place when all messages entering the switch are bound for the same external link.
Congestion may result at the external link TRAM, forcing the whole Transputer network to slow down.
This condition is simulated by routing the traffic from three sources to the same sink.

The generator transmitted 5000 1002-byte packets at constant packet delay intervals. The maximum
throughput was calculated by decreasing the packet transmitting delay on successive runs of the test. The
maximum throughput recorded from the test was λ = 0.568 Mbps and the throughput efficiency was
λ_eff = 80.33%. The originally proposed reason for the increased throughput, instead of the expected
decrease, is that the transputer’s link utilization becomes somewhat more efficient under heavier loads.
Based on subsequent tests of the enhanced design, the cause now proposed is limited parallelism in the
communication channels provided by the design.

Test 3: Impact of Dynamic Routing Table Updates. The most interesting feature of the packet
routing switch is its dynamic routing table update capability. The effects of dynamically loading routing
table updates into the Transputer network during standard operation are of particular concern. Since the
throughput will be directly affected by these updates, this test was performed to measure the throughput
as the update frequency was increased. For the sake of clarity the traffic patterns were kept constant in
the switch while the throughput was recorded.

To analyze the effects of the routing table update injection the maximum throughput of the switch was
first recorded prior to introducing the updates. The source transmitted 5000 1002-byte packets at constant
packet delay intervals. The maximum throughput was calculated by decreasing the packet transmitting
delay on successive runs of the test. For packets routed through the bridge, the maximum throughput
recorded was λ = 0.406 Mbps and the throughput efficiency was λ_eff = 57.40%. This contrasts with
packets not routed through the bridge (λ = 0.407 Mbps and λ_eff = 57.54% respectively). The bridge
used in this project provides only the service access point for host data communications onto the router
network. Protocol conversions are not performed. Additional information on the bridge’s functionality
is presented in Section 4.

Once  the  maximum  operating  throughput  of  the  switch  was  recorded,  the  switch  was  driven  at  this 
constant  throughput  and  routing  table  updates  were  introduced.  The  routing  table  updates  used  did  not 
reflect changes of external link connected nodes; rather, they reinitialized the external routing tables to
their  original  value  so  that  the  throughput  deterioration  obtained  could  be  compared  to  the  maximum 
throughput  obtained  without  the  routing  table  updates.  This  deterioration  demonstrates  the  overhead 
required  to  update  external  routing  tables  throughout  the  switch. 

The  update  source  used  in  the  test  was  similar  to  the  packet  generators  in  capability.  The  only 
difference  was  that  it  distributed  routing  table  update  IDUs  from  the  host.  The  update  source  sent  five 
12-byte  routing  table  update  IDUs,  each  destined  for  a  separate  TRAM  in  the  network.  These  update 
IDUs  were  delivered  one  after  another  without  using  any  inter-departure  delays.  Once  all  five  updates 
were  injected  into  the  Transputer  network  the  update  source  waited  for  the  inter-transmission  delay  and 
then repeated the same operation. Updates were generated throughout the duration of the test.

The same source characteristics were used to drive the switch again, except both sources were driven
at the maximum allowable throughputs, as recorded earlier. The maximum throughputs recorded from the
test were the same as without routing updates and the effect on the corresponding throughput efficiencies
was considered negligible.

Tests 4 and 5: General Switch Operation. The last two tests were conducted to generate data
which would reflect the packet routing switch’s general operational characteristics. In these cases, random
destination traffic would be passing through the switch with varying packet sizes and varying inter-arrival
times. Since we are interested in the switch’s general operation in the project testbed, packet sizes and
inter-arrival times were matched to the expected traffic in the testbed.

Test  4  was  designed  to  analyze  throughput  of  the  switch  with  random  destination  packets  passing 
through  the  switch.  Sources  and  sinks  were  connected  to  all  the  external  links.  An  additional  sink  was 
included  to  absorb  packets  with  destinations  set  for  the  packet  routing  switch’s  host  node. 

The  generator  used  a  normally  distributed  packet  size  and  a  normally  distributed  packet  delay  with  a 
uniformly  distributed  random  packet  destination  field.  The  maximum  aggregate  throughput  was  calculated 
for each sink by decreasing the packet transmitting delay on successive runs of the test. The maximum
throughputs  recorded  from  the  test  were: 

SINK 0  λ = 0.121 Mbps
SINK 1  λ = 0.116 Mbps
SINK 2  λ = 0.110 Mbps      (1)
SINK 3  λ = 0.118 Mbps
SINK 4  λ = 0.116 Mbps

To analyze the general operation of the switch, random traffic patterns were used in Test 5. The
configuration used for Test 5 was identical to Test 4 except an additional source was included to model
traffic originating from the host node.

The generator used a normally distributed packet size and a normally distributed packet delay with
a uniformly distributed random packet destination field. The maximum aggregate throughput was
calculated for each sink by decreasing the packet transmitting delay on successive runs of the test.
The maximum throughputs recorded from the test were:

SINK 0  λ = 0.068 Mbps
SINK 1  λ = 0.063 Mbps
SINK 2  λ = 0.066 Mbps      (2)
SINK 3  λ = 0.064 Mbps
SINK 4  λ = 0.066 Mbps


Table 1 provides a comprehensive listing of the results obtained from the tests performed. λ_agg in
Table 1 corresponds to the sum of all the maximum throughputs achieved at all sinks during a test. This
metric provides a valuable insight into the functioning of the packet routing switch when it is viewed as
a black box device. By comparing the aggregate throughput for different test cases, valid comparisons
of the throughput of the switch can be made.

Test                        λ_agg       Configuration                λ           λ_relative
Transputer Link Throughput  0.707 Mbps                                           N/A
Test 1                      0.531 Mbps                               0.531 Mbps  λ_eff = 75.06%
Test 2                      0.568 Mbps                               0.568 Mbps  λ_eff = 80.33%
Test 3                      0.812 Mbps  Direct Traffic (no updates)  0.407 Mbps  λ_eff = 57.54%
                                        Direct Traffic (w/ updates)  0.407 Mbps  %
                                        Through Bridge (no updates)  0.406 Mbps  λ_eff = 57.40%
                                        Through Bridge (w/ updates)  0.406 Mbps  %
Test 4                      0.581 Mbps  Sink 0                       0.121 Mbps
                                        Sink 1                       0.116 Mbps
                                        Sink 2                       0.110 Mbps
                                        Sink 3                       0.118 Mbps
                                        Sink 4                       0.116 Mbps
Test 5                      0.327 Mbps  Sink 0                       0.068 Mbps  N/A
                                        Sink 1                       0.063 Mbps  N/A
                                        Sink 2                       0.066 Mbps  N/A
                                        Sink 3                       0.064 Mbps  N/A
                                        Sink 4                       0.066 Mbps  N/A

Table 1 Cumulative results
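
The aggregate throughputs for Tests 4 and 5 can be checked directly against the per-sink values listed in (1) and (2). The Python sketch below simply sums the sink throughputs; the variable and function names are hypothetical and the rounding to three decimal places mirrors the precision used in the report.

```python
# Sketch: aggregate throughput as the sum of per-sink maximum throughputs (Mbps).
# Per-sink values are those listed in equations (1) and (2).

TEST4_SINKS = {0: 0.121, 1: 0.116, 2: 0.110, 3: 0.118, 4: 0.116}
TEST5_SINKS = {0: 0.068, 1: 0.063, 2: 0.066, 3: 0.064, 4: 0.066}


def aggregate_throughput(sinks):
    """Sum the per-sink throughputs and round to three decimal places."""
    return round(sum(sinks.values()), 3)


if __name__ == "__main__":
    print(aggregate_throughput(TEST4_SINKS))
    print(aggregate_throughput(TEST5_SINKS))
```

The sums reproduce the aggregate values of 0.581 Mbps and 0.327 Mbps reported for Tests 4 and 5.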

3  TRANSPUTER  NETWORK  DESIGN 

The new design is illustrated in Figure 6. This black box design provides for optimal routing of data
traffic. The required high-speed channels are provided for the host link and each of the four external
(source/sink) links. A separate router is dedicated to each external link to provide optimal handling of
arriving message traffic. Furthermore, two router/driver pairs are placed on a single Transputer site to
optimize communications using the faster hard links. Routers are connected to provide fully parallel
channels of communication between each distinct pair of routers.

Figure 6 Transputer architecture (slots 0-3 across sites 0-2; legend: hard link connection, soft link connection).


To provide a clearer understanding of the overall model’s functionality, the corresponding software
process names are shown on their respectively assigned transputer in Figure 6. This also ensures that
the software and hardware configurations match. Because communication channels between processes on
separate transputers are mapped directly onto hardware links by the Transputer boot loader, the software
and hardware configurations must match to prevent a load failure. The mapping specifications are declared
in an accompanying OCCONF configuration program to provide for the parallel execution of the host,
bridge, router, and driver processes on their respective assigned processors.

4  PROCESS  DESIGN 

Preliminary  Implementation 

The Communicating Sequential Processes (CSP) (Hoare 1985) specification methodology provides
the necessary means for specifying concurrent occam processes. It also supports the refinement of
specifications through multiple levels of abstraction to enhance reasoning about inter-process behavior.
This understanding helps to ensure that the correct functionality is achieved and provides confidence
that deadlock will not occur. In addition, well-defined CSP specifications are readily refined into occam
code which can subsequently be executed on a Transputer system with minimal debugging. The occam
processes described in the following paragraphs were directly refined from the CSP specifications provided
in Appendix A.

The host process drives the host communications link, as shown in Figure 7. In accordance with occam
programming conventions, it serves as the single service access point for all host PC I/O. Its primary
function is to introduce data packets and routing table updates. It is also responsible for overall control of
process execution to provide for the synchronization of the driver processes at each stage during a test. A
single stage, for example, might record the throughput of 5000 1002-byte data packets with a mean
inter-packet arrival time of 0.010 seconds. After re-synchronization, the next stage would begin with packet
transmissions distributed with a mean arrival time of 0.009 seconds, etc. Occam discriminated channel
protocol signals are used to provide re-synchronization at each new stage and the clean termination of
all parallel processes upon test completion.

GDP - Generate Data Packets
GRU - Generate Routing Updates
MHO - Multiplex Host Outputs
RHI - Receive Host Inputs

ho.ot - output channel (from host)
ho.in - input channel (to host)

i.c# - internal channel number

- -> - control signal traffic only
--> - control and data traffic

Figure 7 Host Communications Flow Diagram

The bridge process serves as the access point for the placement and extraction of host traffic to and
from the router communications network, as shown in Figure 8. Other message traffic on the router
network which enters the bridge is re-transmitted to the corresponding router on the opposite side of
the bridge. Control signals on the router network which enter the bridge are captured and forwarded
to the host process. Control signals from the host are transmitted to both routers directly connected to
the bridge. These routers are subsequently responsible for forwarding appropriate host signals to their
assigned partners. For performance reasons, each of the two routers not directly connected to the bridge
was coupled with the directly connected router sharing the same Transputer site.


MHO - Multiplex Host Outputs
DHI - Demultiplex Host Inputs
MRO - Multiplex Router Outputs

ho.ot - output channel (to host)
ho.in - input channel (from host)
r1.ot - output channel (to router2)
r1.in - input channel (from router2)
r2.ot - output channel (to router0)
r2.in - input channel (from router0)

i.c# - internal channel number

- -> - control signal traffic only
--> - control and data traffic


Figure 8 Bridge Communications Flow Diagram


Router processes, as shown in Figure 9, direct message traffic flow around the router network. Since
fully connected routers are used, every external data message must pass through exactly two routers.
This greatly simplifies the routing process since any external data message arriving on any channel from
another router can be directly transferred to its associated driver (sink) process. Routing table lookups
are only required when an external message is received from the driver (source) or an internal message is
received from the host (source) to ensure the message is forwarded to the appropriate destination router.

MDO - Multiplex Driver Outputs
DDI - Demultiplex Driver Inputs
MHO - Multiplex Host Outputs
DHI - Demultiplex Host Inputs
MRO - Multiplex Router Outputs
DRI - Demultiplex Router Inputs

dr.ot - output channel (to driver)
dr.in - input channel (from driver)
ho.ot - output channel (to host route)
ho.in - input channel (from host route)
r1.ot - output channel (to partner route)
r1.in - input channel (from partner route)
r2.ot - output channel (to other route)
r2.in - input channel (from other route)

i.c# - internal channel number

- -> - control signal traffic only
--> - control and data traffic


Figure 9 Router Communications Flow Diagram
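
The two-hop forwarding rule described for the router processes can be sketched as follows. This Python fragment is a simplified model, not the occam implementation: the function name, the tuple return values, the node addresses, and the dictionary used as a routing table are all hypothetical. It shows only that a lookup is needed for messages arriving from a driver or the host, while messages arriving from another router go straight to the local driver.

```python
# Sketch of the forwarding decision in a fully connected router network.
# All names and the routing-table representation are illustrative assumptions.

def next_hop(router_id, source, destination, routing_table):
    """Decide where a router sends a message.

    source is "router", "driver" or "host", naming where the message came from.
    routing_table maps a destination node address to the router that serves it.
    """
    if source == "router":
        # Second hop: the message has already been routed, deliver it locally.
        return ("driver", router_id)
    if source in ("driver", "host"):
        # First hop: a table lookup selects the destination router.
        return ("router", routing_table[destination])
    raise ValueError("unknown source: " + source)


if __name__ == "__main__":
    table = {"A": 0, "B": 1, "C": 2, "D": 3}
    print(next_hop(0, "driver", "C", table))   # first hop, via lookup
    print(next_hop(2, "router", "C", table))   # second hop, no lookup
    table["C"] = 3                             # a dynamic routing table update
    print(next_hop(0, "driver", "C", table))
```

Because the lookup happens only on the first hop, a routing table update changes the chosen destination router without affecting how the second router handles the message.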


Drivers perform two primary functions, as shown in Figure 10. First, they serve as the source for data
message traffic on the router network. Second, they act as a sink for all message traffic arriving from the
router network. Inter-packet arrival data is accumulated for post-test result calculations.


SDO - Send Driver Outputs
RDI - Receive Driver Inputs

dr.ot - output channel (from driver)
dr.in - input channel (to driver)

i.c# - internal channel number

- -> - control signal traffic only
--> - control and data traffic


Figure 10 Driver Communications Flow Diagram


To eliminate any potential impact on test results, all distributed values were calculated in advance and
stored in arrays for easy retrieval during the appropriate test stage. External link drivers, serving in their
capacity as sinks, also store only inter-packet arrival results. All results are retrieved and calculations are
performed at the completion of each stage of message routing tests. No host PC I/O is performed during
the testing process. These measures ensure the accuracy of routing throughput test results.
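
The measurement discipline described above (precompute the delay schedule, record only raw arrival times, calculate afterwards) can be sketched as follows. This is a minimal illustration in Python rather than occam; the function names, the Gaussian delay distribution, and the tick rate are assumptions made for the example and are not taken from the report's source code.

```python
import random

def precompute_delays(count, mean_ticks, stddev_ticks, seed=0):
    """Build the whole inter-packet delay schedule before the test starts.

    Assumption: delays are drawn from a normal distribution and clipped at 0.
    """
    rng = random.Random(seed)
    return [max(0, round(rng.gauss(mean_ticks, stddev_ticks))) for _ in range(count)]

def throughput_mbps(arrival_ticks, packet_bytes, ticks_per_second):
    """Post-test calculation from the arrival timestamps stored by a sink."""
    if len(arrival_ticks) < 2:
        return 0.0
    elapsed_s = (arrival_ticks[-1] - arrival_ticks[0]) / ticks_per_second
    bits = 8 * packet_bytes * (len(arrival_ticks) - 1)
    return bits / elapsed_s / 1e6

# Hypothetical run: three 125-byte packets arriving 1000 ticks apart,
# with an assumed clock of 1,000,000 ticks per second.
delays = precompute_delays(count=3, mean_ticks=1000, stddev_ticks=50)
rate = throughput_mbps([0, 1000, 2000], packet_bytes=125, ticks_per_second=1_000_000)
```

Because the schedule is an ordinary array and the sink only appends timestamps, no random-number generation or arithmetic competes with packet handling while a test stage is running.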
Enhanced Implementation


Results from the preliminary design tests were promising. The throughput speeds achieved provided
conclusive proof of the T800's suitability for the project testbed. However, two observations from the
preliminary design tests provided reasons for concern. The first observation was encountered during the
second part of test P3. During this test, a livelock condition occurred when the arrival times between
successive routing update IDUs dropped below 30 ticks (or CPU clock cycles). The other observation
occurred during test P5 when the system became deadlocked as inter-packet arrival times dropped below
50 ticks.

The  routing  update  problem  was  located  within  the  multiplex  host  output  (MHO)  process  shown  in 
Figure  7.  The  occam  ALT  construct  is  used  in  this  process,  so  that  when  a  single  channel  is  ready 
to  transmit,  that  channel  is  served,  while  if  more  than  one  channel  is  ready  to  transmit,  the  ALT 
indiscriminately  chooses  which  one  to  serve.  The  implementation  of  the  ALT  statement  checks  the 
channels  in  sequential  order  as  listed  within  the  ALT  statement  body,  with  each  new  pass  starting  at  the 
top  of  the  list.  Thus,  when  the  routing  update  channel  was  listed  first  and  updates  arrived  at  a  high  enough 
rate,  the  updates  consumed  all  the  time  slices  so  that  no  other  communications  were  serviced.  The  livelock 
condition  was  eliminated  by  modifying  the  channel  list  so  that  the  routing  update  channel  appeared  last. 
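
The starvation effect described above can be reproduced with a small model. The sketch below is Python, not occam, and it only models the behaviour reported in the text: channels are scanned in listed order on every pass and the first ready channel is served. The channel names and the readiness pattern (updates always pending, data ready on every second pass) are assumptions chosen for illustration.

```python
def simulate_alt(channel_order, is_ready, passes):
    """Serve one channel per pass, always scanning from the top of the list."""
    served = {name: 0 for name in channel_order}
    for tick in range(passes):
        for name in channel_order:
            if is_ready[name](tick):
                served[name] += 1
                break  # the next pass starts again at the first channel
    return served

# Assumed load: routing updates arrive fast enough to be ready on every pass,
# while a data packet is ready on every second pass.
is_ready = {
    "route.update": lambda tick: True,
    "data": lambda tick: tick % 2 == 0,
}

update_first = simulate_alt(["route.update", "data"], is_ready, 100)
update_last = simulate_alt(["data", "route.update"], is_ready, 100)
```

With the update channel listed first, the data channel is never served; with the update channel listed last, data is served whenever it is ready and updates use the remaining passes.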

The deadlock situation encountered in test P5 was more serious. By examining the preliminary CSP
specification and the steps in the refinement process, it became apparent that measures taken to reduce
process contention and packet duplication were at the root of the problem. These measures were designed
to reduce internal communications to an absolute minimum. To accomplish this, ALT processes were
used to multiplex appropriate input channels onto the respective output channels within the bridge and
router processes, as shown in the "before" sections of Figures 11 and 12. Only the host and driver input
and output channels were separated into distinct parallel processes.


MHO - Multiplex Host Outputs
DHI - Demultiplex Host Inputs
MRO - Multiplex Router Outputs
DRI - Demultiplex Router Inputs

ho.ot - output channel (to host)
ho.in - input channel (from host)
r1.ot - output channel (to router2)
r1.in - input channel (from router2)
r2.ot - output channel (to router0)
r2.in - input channel (from router0)

i.c# - internal channel #

- - > - control signal traffic only
--> - control and data traffic

Figure  11  Bridge  Communications  Flow  Enhancements 
Panels: before, after

MDO - Multiplex Driver Outputs
DDI - Demultiplex Driver Inputs
MHO - Multiplex Host Outputs
DHI - Demultiplex Host Inputs
MRO - Multiplex Router Outputs
DRI - Demultiplex Router Inputs

dr.ot - output channel (to driver)
dr.in - input channel (from driver)
ho.ot - output channel (to host route)
ho.in - input channel (from host route)
r1.ot - output channel (to partner route)
r1.in - input channel (from partner route)
r2.ot - output channel (to other route)
r2.in - input channel (from other route)

i.c# - internal channel number

- - > - control signal traffic only
--> - control and data traffic


Figure 12 Router Communications Flow Enhancements


The problem with this approach is that the parallel lines of communication at the bridge and routers
were coupled together too tightly. At the bridge, traffic between connected routers was coupled to the
host's outbound traffic. At router 0 and router 2, host communications to driver 1 and driver 3 were
coupled to driver 0 and driver 2 communications, respectively. Deadlock could occur if four synchronized
data packets were transmitted: from the host to driver 1 and driver 3, from driver 0 to driver 2, and from
driver 2 to driver 0. Under these circumstances, driver 0 and driver 2 deadlocked because the bridge
had blocked to serve the host communications. The host deadlocked because router 0 and router 2 had
blocked to serve the driver 0 and driver 2 communications, respectively. Eventually, the remainder of
the router network filled up with blocked traffic, leading to complete system deadlock.
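
The circular wait in this scenario can be made explicit with a wait-for graph. The Python sketch below is an illustrative reading of the paragraph above, not part of the original design: the node names and edges are assumptions that encode "X is blocked until Y accepts a packet", and `find_cycle` is a generic depth-first cycle search.

```python
def find_cycle(waits_for):
    """Return one cycle in a wait-for graph as a list of nodes, or None."""
    visiting, done, path = set(), set(), []

    def visit(node):
        visiting.add(node)
        path.append(node)
        for nxt in waits_for.get(node, ()):
            if nxt in visiting:  # back edge: a circular wait exists
                return path[path.index(nxt):] + [nxt]
            if nxt not in done:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        visiting.discard(node)
        done.add(node)
        path.pop()
        return None

    for node in waits_for:
        if node not in done:
            cycle = visit(node)
            if cycle:
                return cycle
    return None

# Assumed "before" coupling: the bridge is blocked handing host packets to the
# routers, while each router is blocked handing a driver packet to the bridge.
coupled = {
    "driver0": ["router0"], "driver2": ["router2"], "host": ["bridge"],
    "bridge": ["router0", "router2"],
    "router0": ["bridge"], "router2": ["bridge"],
}

# Assumed "after" view of one decoupled path: driver 0 to driver 2.
decoupled = {
    "driver0": ["router0"], "router0": ["bridge"],
    "bridge": ["router2"], "router2": ["driver2"], "driver2": [],
}

cycle = find_cycle(coupled)
```

In the coupled graph the search reports a bridge/router cycle, which corresponds to the mutual blocking described in the text; once each path has its own channel the same search finds no cycle.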


To solve the deadlock problem, all inbound and outbound channels at the bridge and routers were
decoupled, as shown in the "after" sections of Figures 11 and 12 for the enhanced version of the satellite
model. During the refinement process, an interesting observation was made. The refinements restricted
each ALT process in the model to either a single input or a single output channel for data traffic
communications. This eliminated the deadlock condition and, by opening up parallel channels of
communication, also significantly increased all system throughput rates.

5  SIMULATION  RESULTS 

Preliminary  Implementation 

Test P0: Maximum Link Throughput To ascertain the raw data rate of the configured system, the
throughput of a single Transputer link was measured. The measured value was calculated by configuring
the Host (sink) and Driver1 (source) processes on a directly connected hard link. The source used a
constant 1002-byte packet size and a constant packet delay. The maximum throughput was found
by decreasing the packet transmission delay on successive runs of the test. The maximum throughput
recorded from the test was $\lambda_{max} = 13.464$ Mbps.

Test P1: Maximum Switch Throughput The first test involves recording the throughput of the
packet routing switch under ideal conditions. The maximum throughput recorded from the test was
$\lambda = 6.811$ Mbps and the throughput efficiency (relative to the maximum link throughput) was
$\lambda_{eff} = 50.59\%$.
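
The reported efficiency figures are consistent with taking the ratio of the measured throughput to the maximum link throughput from Test P0. The short Python check below is a sketch of that reading; the function name is hypothetical, and the formula is inferred from the reported numbers rather than quoted from the report.

```python
def throughput_efficiency(throughput_mbps, link_max_mbps):
    """Efficiency in percent, relative to the maximum link throughput."""
    return 100.0 * throughput_mbps / link_max_mbps

LINK_MAX_P0 = 13.464  # Mbps, Test P0

p1_eff = round(throughput_efficiency(6.811, LINK_MAX_P0), 2)  # Test P1
p2_eff = round(throughput_efficiency(7.967, LINK_MAX_P0), 2)  # Test P2
```

Both values match the efficiencies quoted for Tests P1 and P2.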

Test P2: Congested External Link The worst possible performance of the packet routing switch
is conceived as when all messages entering the switch are bound for the same external link. Congestion
may result at the external link TRAM, forcing the whole Transputer network to slow down. The
maximum throughput recorded from the test was $\lambda = 7.967$ Mbps and the throughput efficiency was
$\lambda_{eff} = 59.17\%$.

Test P3: Impact of Dynamic Routing Table Updates To analyze the effects of the routing table
update injection, the maximum throughput of the switch was first recorded prior to introducing the
updates. For packets routed through the bridge, the maximum throughput recorded was $\lambda = 3.700$ Mbps
and the throughput efficiency was $\lambda_{eff} = 27.48\%$. This contrasts with packets not routed through the
bridge ($\lambda = 4.800$ Mbps and $\lambda_{eff} = 35.65\%$ respectively).


Once the maximum operating throughput of the switch was recorded, the switch was driven at this
constant throughput and routing table updates were introduced. This test demonstrates the overhead
required to update external routing tables throughout the switch. For packets passing through the bridge,
the maximum throughput recorded from the test was $\lambda = 3.504$ Mbps and the throughput efficiency
deteriorated by $\lambda_{det} = 5.30\%$. This contrasts with packets not routed through the bridge
($\lambda = 4.740$ Mbps and $\lambda_{det} = 1.25\%$ respectively).
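
The deterioration figures agree with the relative drop in throughput once updates are injected. The Python sketch below shows that calculation; the function name is hypothetical and the formula is inferred from the reported values, not stated in the report.

```python
def throughput_deterioration(before_mbps, after_mbps):
    """Relative throughput loss in percent after routing updates are added."""
    return 100.0 * (before_mbps - after_mbps) / before_mbps

bridge_det = throughput_deterioration(3.700, 3.504)  # packets through the bridge
direct_det = throughput_deterioration(4.800, 4.740)  # packets not through the bridge
```

Rounded to two decimals, these reproduce the 5.30% and 1.25% values quoted for Test P3.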


Tests P4 and P5: General Switch Operation The last two tests were conducted to generate data
which would reflect the packet routing switch's general operational characteristics. Figure 13 shows a
plot of the throughput versus the packet transmission delay. The maximum throughputs recorded from
the test were:


HOST      $\lambda = 3.913$ Mbps
DRIVER 0  $\lambda = 2.776$ Mbps
DRIVER 1  $\lambda = 2.776$ Mbps          (3)
DRIVER 2  $\lambda = 2.806$ Mbps
DRIVER 3  $\lambda = 2.843$ Mbps


Figure 13 Test P4: Switch general operation throughput plot. (Preliminary Design)


The configuration used for Test P5 was identical to Test P4 with the inclusion of message traffic from
the host. Figure 14 shows a plot of the throughput versus the packet transmission delay. The maximum
throughputs recorded from the test were:
HOST      $\lambda = 2.152$ Mbps
DRIVER 0  $\lambda = 2.169$ Mbps
DRIVER 1  $\lambda = 2.063$ Mbps
DRIVER 2  $\lambda = 2.049$ Mbps
DRIVER 3  $\lambda = 2.091$ Mbps


Figure 14 Test P5: Switch general operation throughput plot. (Preliminary Design)


The results obtained from this simulation are summarized in Table 2, which includes the aggregate data
rate for the routing switch taken as a whole.
Test  Measurement                   Throughput   Efficiency  Deterioration  Aggregate
                                    (Mbps)       (%)         (%)            (Mbps)
----  ----------------------------  -----------  ----------  -------------  ---------
P0    Transputer link throughput    13.464
P1    Maximum switch throughput     6.811        50.59                      6.811
P2    Congested external link       7.967        59.17                      7.967
P3    Direct traffic (no updates)   4.800        35.65                      8.500
      Direct traffic (w/ updates)   4.740                    1.25
      Through bridge (no updates)   3.700        27.48
      Through bridge (w/ updates)   3.504                    5.30
P4    Host                          3.913
      Driver 0                      2.776
      Driver 1                      2.776
      Driver 2                      2.806
      Driver 3                      2.843
P5    Host                          2.152        N/A                        10.524
      Driver 0                      2.169        N/A
      Driver 1                      2.063        N/A
      Driver 2                      2.049        N/A
      Driver 3                      2.091        N/A

Table 2 Cumulative results

Enhanced  Implementation 

Test E0: Maximum Link Throughput To ascertain the raw data rate of the enhanced system, the
throughput of a single Transputer link was again measured by configuring the Host (sink) and Driver1
(source) processes on a directly connected hard link. Figure 15 shows a plot of the throughput versus the
packet transmission delay. The maximum throughput recorded from the test was $\lambda_{max} = 13.468$ Mbps,
a slight improvement of about 0.004 Mbps over the preliminary implementation.


Figure 15 Test E0: TRAM Link throughput plot.


Test E1: Maximum Switch Throughput The first test involves recording the throughput of the
packet routing switch under ideal conditions. Figure 16 shows a plot of the throughput versus the packet
transmission delay. The maximum throughput recorded from the test was $\lambda = 11.454$ Mbps and the
throughput efficiency (relative to the maximum link throughput) was $\lambda_{eff} = 85.05\%$. This is an
improvement of about 34.5 percentage points over the preliminary implementation (85.05% versus 50.59%).


Figure 16 Test E1: Maximum switch throughput plot.


Test E2: Congested External Link The worst possible performance of the packet routing switch
is conceived as when all messages entering the switch are bound for the same external link. Congestion
may result at the external link TRAM, forcing the whole Transputer network to slow down. Figure 17
shows a plot of the throughput versus the packet transmission delay. The maximum throughput recorded
from the test was $\lambda = 7.856$ Mbps and the throughput efficiency was $\lambda_{eff} = 58.33\%$, a decrease of
about 0.110 Mbps and less than 1% from the preliminary implementation.


Figure 17 Test E2: Congested external link throughput plot.


Test E3: Impact of Dynamic Routing Table Updates To analyze the effects of the routing table
update injection, the maximum throughput of the switch was first recorded prior to introducing the
updates. Figure 18 shows a plot of the throughput versus the packet transmission delay. For packets
routed through the bridge, the maximum throughput recorded was $\lambda = 4.789$ Mbps and the throughput
efficiency was $\lambda_{eff} = 35.56\%$, an increase of 1.089 Mbps and over 8% respectively from the preliminary
implementation. This contrasts with packets not routed through the bridge ($\lambda = 6.678$ Mbps and
$\lambda_{eff} = 49.58\%$ respectively), an increase of 1.877 Mbps and over 14% respectively from the preliminary
implementation.


Figure 18 Test E3: Normal link throughput plot for cross-traffic.

Once the maximum operating throughput of the switch was recorded, the switch was driven at this
constant throughput and routing table updates were introduced. This test demonstrates the overhead
required to update external routing tables throughout the switch. Figure 19 shows a plot of the throughput
versus the update transmission delay. For packets passing through the bridge, the maximum throughput
recorded from the test was $\lambda = 4.711$ Mbps and the throughput efficiency deteriorated by $\lambda_{det} = 1.63\%$.
This contrasts with packets not routed through the bridge ($\lambda = 4.710$ Mbps and $\lambda_{det} = 29.47\%$
respectively).
Figure 19 Test E3: Throughput w/ Increasing Routing Table Update Frequency.


Tests  E4  and  E5:  General  Switch  Operation  The  last  two  tests  were  conducted  to  generate  data 
which  would  reflect  the  packet  routing  switch's  general  operational  characteristics.  Figure  20  shows  a 
plot  of  the  throughput  versus  the  packet  transmission  delay.  The  maximum  throughputs  recorded  from 
the  test  were: 
HOST      $\lambda = 5.746$ Mbps
DRIVER 0  $\lambda = 4.349$ Mbps
DRIVER 1  $\lambda = 4.092$ Mbps
DRIVER 2  $\lambda = 4.127$ Mbps
DRIVER 3  $\lambda = 4.184$ Mbps


Figure 20 Test E4: Switch general operation throughput plot.


The configuration used for Test E5 was identical to Test E4 with the inclusion of message traffic from
the host. Figure 21 shows a plot of the throughput versus the packet transmission delay. The maximum
throughputs recorded from the test were:

HOST      $\lambda = 4.651$ Mbps
DRIVER 0  $\lambda = 4.690$ Mbps
DRIVER 1  $\lambda = 4.457$ Mbps          (6)
DRIVER 2  $\lambda = 4.427$ Mbps
DRIVER 3  $\lambda = 4.698$ Mbps
Figure 21 Test E5: Switch general operation throughput plot.


The  results  obtained  from  this  simulation  are  summarized  in  Table  3  which  includes  the  aggregate  data 
rate  for  the  routing  switch  taken  as  a  whole. 
Test  Measurement                   Throughput   Efficiency  Deterioration  Aggregate
                                    (Mbps)       (%)         (%)            (Mbps)
----  ----------------------------  -----------  ----------  -------------  ---------
E0    Transputer link throughput    13.468
E1    Maximum switch throughput     11.454       85.05                      11.454
E2    Congested external link       7.856        58.33                      7.856
E3    Direct traffic (no updates)   6.678        49.58                      11.467
      Direct traffic (w/ updates)   4.710                    29.47
      Through bridge (no updates)   4.789        35.56
      Through bridge (w/ updates)   4.711                    1.63
E4    Host                          5.746
      Driver 0                      4.349
      Driver 1                      4.092
      Driver 2                      4.127
      Driver 3                      4.184
E5    Host                          4.651
      Driver 0                      4.690
      Driver 1                      4.457
      Driver 2                      4.427
      Driver 3                      4.698

Table 3 Cumulative results


6  CONCLUSIONS 


In this report, we have described the design process used to implement a Transputer based packet
routing switch using cooperating occam processes on a network of T800 Transputers.

The maximum Transputer link throughput calculated was 13.468 Mbps. This computed value is
consistent with the Inmos published link speed of 20 Mbps once byte-transfer overhead bits are taken
into account. This also compares very favorably with the previous (Genesys and C based) benchmark of only
3.81 Mbps. The recorded throughputs are sufficient to warrant consideration of the Transputer system's
suitability to the project testbed.
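
As a rough cross-check of the overhead argument, the payload ceiling of a 20 Mbps link can be estimated from the link framing. The Python sketch below assumes the commonly documented Transputer link format in which each data byte is sent as an 11-bit packet (a start bit, a flag bit, eight data bits, and a stop bit); that framing, and the assumption that acknowledgements on the reverse line do not stall the sender, are not stated in this report.

```python
LINK_BIT_RATE_MBPS = 20.0   # published link speed quoted above
BITS_PER_DATA_PACKET = 11   # assumed framing: start + flag + 8 data + stop
PAYLOAD_BITS = 8

# Upper bound on unidirectional payload throughput under the assumed framing.
payload_ceiling = LINK_BIT_RATE_MBPS * PAYLOAD_BITS / BITS_PER_DATA_PACKET

measured = 13.468  # Mbps, Test E0
fraction_of_ceiling = measured / payload_ceiling
```

Under these assumptions the ceiling is about 14.5 Mbps, so the measured 13.468 Mbps corresponds to roughly 93% of the payload capacity of the link.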

The maximum switch throughput recorded was 11.454 Mbps with a throughput efficiency of 85.05%.
The throughput and throughput efficiency indicate that the software design functions well within the
required testbed range.

One significant change from previous results was observed during Test E2. In this test, the throughput
recorded was 7.856 Mbps with a throughput efficiency of 58.33%. Contrary to results seen in the original
and preliminary design tests, these throughput results are less than those of Test E1 (which mimics ideal
switch routing conditions). The lower throughput in the enhanced design is due to congestion at the
switch. The significant change between models was a decoupling of the communication channels to open
parallel channels of communication. This result suggests that the effects of congestion become more
prominent when greater concurrency is achieved in Transputer systems.

The change in the throughput of the switch traffic in Test E3 was considered minimal, since routing
updates would not sustain such high rates in an actual system. The effects at normal update rates are
considered negligible. This is not surprising, considering that a routing message destined for
each internal TRAM is only 5 bytes long (including interface control information). The processing
required to update a node's external routing table is also minimal, since all addresses are stored as 2-byte
values in an integer array.

The throughput results obtained after conducting Test E4 and Test E5 are comparable to those obtained
from Test E2 (the congested external link test). This is expected since all external links of the packet
routing switch are heavily loaded during Test E4 and Test E5, similar to the conditions encountered during
Test E2. In Test E4, the differences can be attributed to the additional presence of traffic on the routers
destined for other sinks. In Test E5, further differences can be attributed to the additional traffic load
from the host, which may require some rerouting and which competes directly with the respective drivers'
load entering the router network. The addition of a packet generating source at the host link connection
increases the aggregate throughput by 0.425 Mbps. The minimum driver throughput results of 4.092 Mbps
(Test E4) and 4.427 Mbps (Test E5) are a significant improvement over the previous (Genesys and C
based) design of only 0.155 Mbps and 0.064 Mbps respectively.
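
The aggregate figures can be checked by summing the per-sink maxima reported for each test. The Python sketch below assumes that the aggregate rate is simply this sum; that reading is inferred from the reported numbers (it reproduces the 0.425 Mbps difference quoted above and the 10.524 Mbps aggregate of Test P5), and the variable names are illustrative.

```python
# Maximum throughputs in Mbps: host, then drivers 0 to 3, as reported per test.
test_e4 = [5.746, 4.349, 4.092, 4.127, 4.184]
test_e5 = [4.651, 4.690, 4.457, 4.427, 4.698]
test_p5 = [2.152, 2.169, 2.063, 2.049, 2.091]

aggregate_e4 = sum(test_e4)
aggregate_e5 = sum(test_e5)
aggregate_p5 = sum(test_p5)

host_source_gain = aggregate_e5 - aggregate_e4   # effect of adding host traffic
min_driver_e4 = min(test_e4[1:])                 # slowest driver in Test E4
min_driver_e5 = min(test_e5[1:])                 # slowest driver in Test E5
```

Under this assumption the enhanced switch carries about 22.5 Mbps in Test E4 and about 22.9 Mbps in Test E5, and the slowest drivers are the 4.092 Mbps and 4.427 Mbps values cited in the text.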


The data suggests that the design of the switch yields throughput results that are clearly within the
operating range specified for the testbed nodes. Therefore, the suitability of this Transputer implementation
for the project testbed is clear.
Appendix A CSP Specifications for Satellite Model


In the following satellite model process specifications, the highest level of abstraction used parallels
the Transputer architecture presented in the paper. Details were omitted to focus on inter-process data
communication flows. More specifically, variable assignments, loop controls, and control signal traffic
are omitted to provide a clearer understanding of data traffic flows.


SATELLITE MODEL SPECIFICATION

SATELLITE_MODEL =
    HOST          -- Host processes
    ||            -- in PARALLEL with
    BRIDGE        -- Bridge processes
    ||            -- in PARALLEL with
    ROUTERS       -- Router processes
    ||            -- in PARALLEL with
    DRIVERS       -- Driver processes


This formalism specifies the object on the left side of the equality, namely SATELLITE_MODEL, to
be composed of a host, a bridge, some routers, and some drivers. Abstract processes are identifiable by
all capital letters in their name. Underscores are used to join multiple word names. The parallel bars, "||",
indicate that the processes are to be executed concurrently. At each new level of abstraction, processes
from the previous level are refined to add more detail, as follows:


HOST PROCESS SPECIFICATION

HOST =
    MULTIPLEX_HOST_OUTPUTS      -- Transmit signals, data and routing packets
    ||                          -- in PARALLEL with
    GENERATE_ROUTING_PACKETS    -- Generate routing table updates
    ||                          -- in PARALLEL with
    GENERATE_DATA_PACKETS       -- Generate internal data packets
    ||                          -- in PARALLEL with
    RECEIVE_HOST_INPUTS         -- Receiving signals & data packets

In the following specifications, the ";" formalism is used to specify sequential execution. Prefix
execution is specified with "->", indicating subsequent execution of the following statement(s) if the
preceding event takes place. To smooth the effects of eliminating control information, simple recursion
is used and is annotated by the formalism "μ.". The (condition)*(statements) formalism is used to represent
looping. The "true branch <| condition |> false branch" formalism represents conditional branching,
noting that the true branch precedes the condition. "?" and "!" are used to indicate channel input and
output, respectively. And "|" is used to denote choice by channel selection.

μ.MULTIPLEX_HOST_OUTPUTS =
    (internal.channel.1 ?                       -- Receive/Send route update
         route.update;router.id;router.destination;new.rte ->
     channel.to.bridge !
         route.update;router.id;router.destination;new.rte
     |                                          -- in ALTERNATE with
                                                -- Receive/Send data packet
     internal.channel.2 ? external.data;destination;length::data ->
     channel.to.bridge ! external.data;destination;length::data);
    MULTIPLEX_HOST_OUTPUTS

μ.GENERATE_ROUTING_PACKETS =
    timer := timer PLUS delay1;                 -- Add route update interpacket delay
    timer.channel ? AFTER timer;                -- Delay until next arrival time
    internal.channel.1 !                        -- Send routing table update
        route.update;brdg;destination;new.route;
    GENERATE_ROUTING_PACKETS

GENERATE_DATA_PACKETS =
    (more.data.to.send) * (                     -- While more host data to send
        timer := timer PLUS delay2;             -- Add data interpacket delay time
        timer.channel ? AFTER timer;            -- Delay until next arrival time
        internal.channel.2 !                    -- Send external data packet
            external.data;destination;length::data);
    SKIP

μ.RECEIVE_HOST_INPUTS =                         -- Receive external data packets
    channel.from.bridge ? external.data;destination;length::data ->
    SAVE_THROUGHPUT_DATA;                       -- Store interpacket timing data
    RECEIVE_HOST_INPUTS


BRIDGE PROCESS SPECIFICATION

BRIDGE =
    MULTIPLEX_TRAFFIC_TO_HOST        -- Send Host control/data traffic
    ||                               -- in PARALLEL with
    DEMULTIPLEX_TRAFFIC_FROM_HOST    -- Receive Host control/data traffic
    ||                               -- in PARALLEL with
    MULTIPLEX_ROUTER0_TRAFFIC        -- Send Router0 control/data traffic
    ||                               -- in PARALLEL with
    MULTIPLEX_ROUTER2_TRAFFIC        -- Send Router2 control/data traffic

μ.MULTIPLEX_TRAFFIC_TO_HOST =        -- Send Host control/data traffic
    (internal.channel.1 ? internal.data;destination;length::data ->
     channel.to.host ! external.data;destination;length::data
     |
     internal.channel.2 ? internal.data;destination;length::data ->
     channel.to.host ! external.data;destination;length::data);
    MULTIPLEX_TRAFFIC_TO_HOST

μ.DEMULTIPLEX_TRAFFIC_FROM_HOST =    -- Receive Host control/data traffic
    channel.from.host ? external.data;destination;length::data ->
    (internal.channel.3 ! internal.data;destination;length::data
     <| dest = RTR0 OR dest = RTR1 |>
     internal.channel.4 ! internal.data;destination;length::data);
    DEMULTIPLEX_TRAFFIC_FROM_HOST

μ.MULTIPLEX_ROUTER0_TRAFFIC =        -- Send Router0 control/data traffic
    (channel.from.router2 ? external.data;destination;length::data ->
     channel.to.router0 ! external.data;destination;length::data
     |
     channel.from.router2 ? internal.data;destination;length::data ->
     internal.channel.1 ! internal.data;destination;length::data
     |
     internal.channel.3 ? internal.data;destination;length::data ->
     channel.to.router0 ! internal.data;destination;length::data);
    MULTIPLEX_ROUTER0_TRAFFIC

μ.MULTIPLEX_ROUTER2_TRAFFIC =        -- Send Router2 control/data traffic
    (channel.from.router0 ? external.data;destination;length::data ->
     channel.to.router2 ! external.data;destination;length::data
     |
     channel.from.router0 ? internal.data;destination;length::data ->
     internal.channel.2 ! internal.data;destination;length::data
     |
     internal.channel.4 ? internal.data;destination;length::data ->
     channel.to.router2 ! internal.data;destination;length::data);
    MULTIPLEX_ROUTER2_TRAFFIC

ROUTER PROCESS SPECIFICATION

ROUTERS = || (for i = 0 to 3) ROUTER(i)

The "(for variable = range)" formalism declares four processes, namely ROUTER(0) through
ROUTER(3), to be executed in parallel.

ROUTER(i) =
    MULTIPLEX_TRAFFIC_TO_DRIVER(i)   -- Send Driver control/data traffic
    ||                               -- in PARALLEL with
    ROUTE_NETWORK_TRAFFIC            -- Route router thru-traffic
    ||                               -- in PARALLEL with
    DEMULTIPLEX_HOST_TRAFFIC         -- Receive host traffic
    ||                               -- in PARALLEL with
    DEMULTIPLEX_ROUTE_A_TRAFFIC      -- Receive routeA traffic

                                     -- Multiplex traffic to driver
μ.MULTIPLEX_TRAFFIC_TO_DRIVER(i) =
    (internal.channel.1 ? external.data;destination;length::data ->
     channel.to.driver(i) ! external.data;destination;length::data
     |
     internal.channel.2 ? external.data;destination;length::data ->
     channel.to.driver(i) ! external.data;destination;length::data
     |
     internal.channel.5 ? external.data;destination;length::data ->
     channel.to.driver(i) ! external.data;destination;length::data);
    MULTIPLEX_TRAFFIC_TO_DRIVER(i)

μ.ROUTE_NETWORK_TRAFFIC =            -- Route traffic to other routers
    (internal.channel.3 ? internal.route;destination;route ->
         (UPDATE_ROUTING_TABLE;
          channel.to.routeA ! internal.route;destination;route
          <| i = 1 OR 3 |>
          SKIP)
     |
     internal.channel.3 ? internal.data;destination;length::data ->
         (channel.to.routeA ! external.data;destination;length::data
          <| table(destination) = RTR(i).RTA |>
          (internal.channel.5 ! external.data;destination;length::data
           <| table(destination) = RTR(i).DRV |>
           SKIP))
     |
     internal.channel.4 ? internal.data;destination;length::data ->
         channel.to.host ! internal.data;destination;length::data
     |
     channel.from.driver(i) ? external.data;destination;length::data ->
         ((channel.to.bridge ! internal.data;destination;length::data
           <| destination = HOST |>
           channel.to.bridge ! external.data;destination;length::data)
          <| table(destination) = RTR(i).HST |>
          (channel.to.routeA ! external.data;destination;length::data
           <| table(destination) = RTR(i).RTA |>
           (channel.to.routeB ! external.data;destination;length::data
            <| table(destination) = RTR(i).RTB |>
            SKIP))));
    ROUTE_NETWORK_TRAFFIC
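
The nested conditionals in the driver-input branch of ROUTE_NETWORK_TRAFFIC amount to a routing table lookup followed by a choice of output channel. The Python sketch below restates that branch only; the function name, the dictionary-based table, and the short next-hop codes "HST", "RTA" and "RTB" (standing in for RTR(i).HST, RTR(i).RTA and RTR(i).RTB) are illustrative assumptions, not the occam implementation.

```python
def route_from_driver(table, destination):
    """Pick the output channel and message tag for a packet from the driver.

    Returns None when no table entry matches, mirroring the SKIP branch.
    """
    next_hop = table.get(destination)
    if next_hop == "HST":
        # Traffic for the host itself is retagged as internal data.
        tag = "internal.data" if destination == "HOST" else "external.data"
        return ("channel.to.bridge", tag)
    if next_hop == "RTA":
        return ("channel.to.routeA", "external.data")
    if next_hop == "RTB":
        return ("channel.to.routeB", "external.data")
    return None

# Hypothetical routing table for one router: destination -> next hop.
table = {"HOST": "HST", 4: "HST", 7: "RTA", 9: "RTB"}
```

For example, `route_from_driver(table, 7)` selects `channel.to.routeA`, while a destination with no table entry is dropped, as in the specification.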

μ.DEMULTIPLEX_HOST_TRAFFIC =         -- Receive inputs from host path
    (channel.from.host ? internal.route;destination;route ->
     internal.channel.3 ! internal.route;destination;route
     |
     channel.from.host ? external.data;destination;length::data ->
     internal.channel.1 ! external.data;destination;length::data
     |
     channel.from.host ? internal.data;destination;length::data ->
     internal.channel.3 ! internal.data;destination;length::data);
    DEMULTIPLEX_HOST_TRAFFIC

μ.DEMULTIPLEX_ROUTE_A_TRAFFIC =      -- Receive inputs from routeA path
    (channel.from.routeA ? external.data;destination;length::data ->
     internal.channel.2 ! external.data;destination;length::data
     |
     channel.from.routeA ? internal.data;destination;length::data ->
     internal.channel.4 ! internal.data;destination;length::data);
    DEMULTIPLEX_ROUTE_A_TRAFFIC


DRIVER PROCESS SPECIFICATIONS

DRIVERS = || (for i = 0 to 3) DRIVER(i)

μ.DRIVER(i) =
    (SEND_DRIVER(i)_DATA                 -- Send driver control/data traffic
     ||
     RECEIVE_ROUTER(i)_TRAFFIC);         -- Receive driver cntrl/data traffic
    SEND_RESULTS;
    DRIVER(i)

SEND_DRIVER(i)_DATA =
    (more.data.to.send) * (              -- While more host data to send
        timer := timer PLUS delay;       -- Add data interpacket delay time
        timer.channel ? AFTER timer;     -- Delay until next arrival time
        channel.to.router(i) !           -- Send external data packet
            external.data;destination;length::data);
    SKIP

μ.RECEIVE_ROUTER(i)_TRAFFIC =            -- Receive driver traffic from router
    channel.from.router(i) ? external.data;destination;length::data ->
    perform.throughput.calculations;
    RECEIVE_ROUTER(i)_TRAFFIC


Appendix  B  Occam  Source  Code  for  Satellite  Model 